The initial time span of auditory processing used for speaker attribution of the speech signal

Authors

V. Lublinskaja

Christian Sappok

Abstract

Research on the temporal organisation of speech perception focuses mostly on the linguistic categories of the input. What is the role of non-grammatical categories in these processes? What kind of mechanisms integrate both kinds of features within the online process of perception? Individual voice qualities and the position of the sentence within the text were chosen to test the time interval in which decisions on speaker attribution are made. The results favour a model with a relatively fixed time span within which a familiar voice or a deviation from an inherent context expectancy is detected.
Whereas research on the temporal organisation of speech perception has focused on the processes of phoneme identification or lexical access, little is known about the timing of the auditory processing of so-called extralinguistic factors such as the speaker's voice, which is known to play an important role in the recognition of speech. A first attempt to proceed in this direction was made by Lublinskaja and Sappok (1996). The task of the present work is to investigate the temporal side of processing speech signals when a listener has to ascribe them to a familiar speaker. While modelling the discourse situation, the listener has to track a target voice within sequences of sentences spoken by different speakers. Two questions have to be answered: (1) How long is the initial time span subjects need to ascribe sentences to a target voice? (2) What is the nature of this interval: does it depend on the specific features of the acoustic events, or is it a standard rate of scanning the results of auditory input as represented in memory? Hypotheses in this field have mostly been discussed in connection with phoneme identification (Chistovich 1984, Massaro 1972) or with lexical access (Marslen-Wilson 1985), paying little attention to the online processing of speakers' voice characteristics.
A set of sentences from the Acoustic Data Base of the Saint-Petersburg University was used as speech material [2]. Two main tests, preceded by two preparatory tests (cf. below), were prepared. The first contained a sequence of 28 sentences spoken by two alternating voices (female and male). In the second test, 31 sentences were spoken by four speakers: two female and two male. In both tests one of the female voices was chosen as the familiar voice and was the object of training procedures; the other voices were unfamiliar. The familiar voice predominated over …

Similar resources

A Comparative Study of Gender and Age Classification in Speech Signals

Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...

Convolutional neural network with adaptive windows for speech recognition

Although speech recognition systems are widely used and their accuracy is continuously increasing, there is a considerable gap between their performance and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

Effects of ageing on speed and temporal resolution of speech stimuli in older adults

Background: According to previous studies, most speech recognition disorders in older adults result from deficits in audibility and auditory temporal resolution. In this paper, the effect of ageing on time-compressed speech and auditory temporal resolution was studied by word recognition in continuous and interrupted noise. Methods: A time-compressed speech test (TCST) w...

Attribution Bias in schizophrenian patients who have auditory hallucination

Introduction: From a cognitivist perspective, the psychotic experiences (hallucinations) of schizophrenic patients have been hypothesized to originate from fundamental cognitive biases. Methods: To explore the idea that attribution bias may underlie the appearance of auditory hallucinations, in the current descriptive study a source-monitoring task was used to compare healthy controls with relatives of indi...

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

Publication date: 1997
